Abstract

After generalizing the concept of clusters to incorporate clusters that are linked to other clusters through some 
relatively narrow bridges, an approach for detecting patches of separation between these clusters is developed based 
on an agglomerative clustering, more specifically the single-linkage, applied to one-dimensional slices obtained from 
respective feature spaces. The potential of this method is illustrated with respect to the analyses of clusterless uniform 
and normal distributions of points, as well as a one-dimensional clustering model characterized by two intervals with 
high density of points separated by a less dense interstice. This partial clustering method is then considered as a means 
of feature selection and cluster identification, and two simple but potentially effective respective methods are described 
and illustrated with respect to some hypothetical situations. 


‘Every block of stone has a statue inside it, and it is the task of the sculptor to discover it.’

Michelangelo. 


1 Introduction 

Grouping entities into abstract categories, or clustering, constitutes one of the most fundamental and intrinsic human activities. By doing so, it is possible to avoid the explosion of labels that would otherwise be implied by the individual identification of each possible entity. The basic grouping rationale that is often considered is that the entities in each category share several features, while differing from entities in other categories. So, emphasis is placed on identifying these common and distinguishing features. For simplicity’s sake, we will call this basic type of group a granular cluster.

Continuing application of clustering principles by humans ultimately gave rise to language, arts, and scientific modeling. Indeed, each word can be understood as a model corresponding to a respective category of objects, actions, etc. The great importance of clustering, as well as the challenge of achieving it computationally, has been reflected in the relatively large number of related approaches reported in the literature.

Many of these approaches are based on, or relate to, the above-mentioned basic grouping principle. In other words, it is often expected that, when mapped into feature spaces, distinct categories will give rise to relatively well-separated compact groups of points, so that large inter-group scattering and small intra-group scattering are generally expected.

Another related, but somewhat complementary approach is to understand as clusters groups of points that can be well-separated through hypersurfaces, or separatrices. These two approaches differ in the important sense that the existence of a separatrix does not necessarily require the granular principle, while the latter usually ensures the former. In a sense, the separatrix principle can then be understood as being less strict than the granular counterpart.

In both cases, it is often assumed that each cluster is completely segregated from all the other clusters, therefore constituting separated, isolated groups. Even when some level of overlap is present, frequently one still aims at obtaining completely separated clusters. This assumption, which is not often made explicit, implies an important characteristic, namely that cluster identification is a global activity, in the sense that the identification of groups or separations between them is performed while taking into account the whole ‘border’ of each group, along all possible directions. In other words, this type of requirement acquires a topological aspect regarding the isolation between the involved clusters. Though contributing to a more complete characterization of the clusters, this requirement implies conceptual and computational demands that are often hard for the respective algorithms to meet.

One of the motivations of the present work is to allow groups of points that are evidently not separated from the remaining groups to be considered as a kind of generalized cluster, characterized by the existence of incomplete separation between the respective groups, in the sense that two groups can be linked even through ‘solid’, but relatively narrow bridges, while still being separated along longer border extensions. These clusters are henceforth called, paradoxically, linked clusters, corresponding to a generalization of the concept of cluster. As a consequence, focus needs to be placed on identifying the existing separation extensions, even if in terms of respective patches.

The basic rationale is that any identification of separation patches between two adjacent groups is intrinsically important, even if incomplete. Indeed, the complete segregation of clusters can be thought of as the integration of many separation patches. A related approach, known as multi-view clustering, considers separating sets of features (multi-views) as the means of improving clustering. Here, we resort to slicing the feature space through narrow hypercylinders aligned along each of the existing features as the means of obtaining separation patches between clusters, which can provide valuable information about the relationships between sets of points in the original feature space.

Patched clustering approaches can also be used to devise potentially effective methods for feature selection, in the sense that if a given feature is found to contribute to several local separations, it would also be a good candidate to contribute to respective complete segregations.

The patched clustering approach has some interesting features, such as the possibility to search for partial separations by considering lower dimensional samples of the original feature spaces. Though other related approaches could be adopted, here we focus on the idea of one-dimensional slicing (or probing) of feature spaces through hypercylinders as a means for identification of patches of separation between clusters. This approach is potentially interesting because of its relatively low computational complexity, as well as for its ability to probe the feature space clustering structure in a more ‘surgical’ and independent way, tending to avoid interactions and projections between clusters that could otherwise be more intense in the case of higher-dimensional approaches.

For all these reasons, the present work develops the concept of partial separation as a way of performing generalized clustering. Observe that the concepts of partial clusters and one-dimensional identification of separating interstices can be thought of as going hand in hand.

This article starts by discussing the concept of generalized clustering to include linked clusters, and follows by presenting the adopted one-dimensional patched clustering approach, which involves the single-linkage agglomerative method. Then, the application of these concepts to deriving a simple and yet potentially effective feature selection methodology and to developing a simple local clustering algorithm is presented and illustrated.

2 Generalized Clusters and Separations

The first central issue regards what we understand by a cluster or separation, which tend to be dual concepts. We have already observed that, in the respective literature, a cluster is often defined as a set of entities that are similar to one another while being different from other entities, as illustrated in Figure 1(a), which we have called a granular cluster. All the examples in this figure assume a two-dimensional feature space defined by hypothetical measurements $f_1$ and $f_2$.



Figure 1: Six distinct types of clusters and respective separations: (a) granular cluster; (b) separatrix cluster; (c) surrounded cluster (a special case of separatrix cluster); (d) a linked cluster; (e) another type of linked cluster; and (f) a separatrix cluster with fuzzy separation. The continuous areas can also be understood as corresponding to respective distributions of points.

Another type of cluster already mentioned is related to the concept of separatrix. For instance, the two clusters in Figure 1(b) are not granular, but can be divided by a separatrix. We have called this type of cluster a separatrix cluster. Recall that granular clusters are almost invariably separatrix clusters, but not vice-versa. Also, observe that the two cluster examples discussed so far can be properly separated by projections along some direction (in this case onto the horizontal axis).

Figure 1(c) illustrates another type of separatrix cluster that is not a granular cluster and that cannot be isolated by a linear separatrix: the inner cluster is completely surrounded by the other cluster. As a consequence, it becomes impossible to identify the separation through consideration of projections along any direction.

A particularly interesting situation is depicted in Figure 1(d). Here, we have a situation similar to the previous example, but the inner cluster turns out to be linked to the outer cluster through a narrow bridge. This type of cluster is henceforth called a linked cluster. Though, properly speaking, this could not be taken to be a cluster, there are several reasons for extending that concept to such situations. One of them is that the bridge could correspond to an artifact implied by noisy or unsuitable features. Another reason is that the cluster may indeed be intrinsically linked, but the link has a transient nature, being likely to be removed or having just been created. Yet another reason is that, even if indeed permanently attached, the inner group of points can still be considered mostly distinct and different from the outer cluster.

Figure 1(e) illustrates another example of linked cluster. Though the left and right-hand sides of the points are topologically linked, it is still interesting to know that the points to each side of the slit are locally disconnected, which can provide valuable insights about the situation under analysis.

The separation interstices may also not be completely void, as they are in the cases shown in Figure 1(a-e), but exhibit an intermediate density of points, giving rise to fuzzy separations that can be related to overlaps between clusters, as in Figure 1(f).

In addition to the above types of clusters, we could also have hybrids involving combinations of these types. For instance, we can have a circular inner cluster whose border involves void separation, fuzzy patches, as well as bridges. The possible coexistence of different types of clusters and separations corroborates the difficulty of devising global clustering approaches capable of identifying this type of cluster.

In the present work, we aim at considering and identifying all these types of clusters and respective separations, yielding a more generalized approach to clustering. In addition to providing valuable information about the relationships between the points in different regions of the feature space, patched identification of clusters/separations can also be understood as a preliminary step to a more global approach, where the locally identified clusters/separations are integrated into larger, potentially complete clusters and separatrices.

3 One-Dimensional Clustering 

The issue of one-dimensional clustering, though intrinsically fundamental and interesting, has received relatively little attention when compared to multi-dimensional counterparts. One possible reason is that the characterization of entities can rarely be comprehensive when just one measurement is adopted. However, it follows from the previous discussions that one-dimensional exploration of entities mapped into feature spaces is intrinsically valuable as an aid for finding relevant features and patched indications of clustering and interstices. More specifically, every indication of clustering, even if detected by a 1D slicing of the feature space, is important and welcome.

In this section, we discuss the problem of one-dimensional clustering in terms of agglomerative approaches, in particular the single-linkage method. Compared to other methods such as k-means, agglomerative methods have an intrinsically desirable feature, namely their inherent ability to represent and characterize data separation in terms of several spatial scales, yielding a respective dendrogram from which the data can be divided into any number of categories (by considering specific clustering heights). This not only allows a more complete characterization of the clustering along these scales, but also provides the means for identifying the relevance of each potential cluster in terms of the length of its respective branch. In addition, adaptive schemes can be considered in which the resolution of the separation is progressively increased along spatial scales, such as zooming.

In the single-linkage method, groups are progressively merged in terms of the nearest distance between them. Recall that the minimum distance between two sets of points corresponds to the minimum distance between any pair of points taken one from each set. The choice of this method, which is known to be susceptible to the phenomenon of chaining, among many other agglomerative possibilities, was justified by preliminary experiments in which the single linkage tended to be more likely to reject clusters in the case of uniform and normal distributions of points. In addition, this method is particularly simple regarding conceptual and computational aspects.
A discussion of one-dimensional clustering can benefit greatly from having models of cluster structures in one dimension. We start by considering a situation not involving clustering (other than for statistical fluctuations), namely the case of points scattered uniformly at random along a one-dimensional domain $x$.

It can be shown that the nearest neighbor distance between such points is given by the exponential probability distribution

$$p(d) = \lambda e^{-\lambda d} \qquad (1)$$

where $\lambda$ is the intensity of points, and whose average is $1/\lambda$.
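
The average in Equation (1) can be checked numerically. The following is a minimal sketch, not part of the original experiments: it assumes that the relevant distances are the gaps between adjacent sorted points, and the names `adjacent_gaps`, `lam` and `mean_gap` are hypothetical. With a fixed number of points in a finite interval the exponential law holds only approximately.

```python
import random


def adjacent_gaps(points):
    """Return the distances between consecutive points after sorting."""
    xs = sorted(points)
    return [b - a for a, b in zip(xs, xs[1:])]


random.seed(0)   # fixed seed, used only to make the sketch reproducible
lam = 500        # assumed intensity: expected number of points per unit length
points = [random.random() for _ in range(lam)]   # uniform scattering on [0, 1)

gaps = adjacent_gaps(points)
mean_gap = sum(gaps) / len(gaps)
print(f"mean gap = {mean_gap:.5f}, 1/lambda = {1 / lam:.5f}")
```

The empirical mean gap should stay close to $1/\lambda = 0.002$ for any seed, since it equals the range covered by the points divided by the number of gaps.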

Let’s consider the single-linkage method, which progressively joins clusters based on their nearest distance, corresponding to the smallest distance between any point of each pair of clusters. Interestingly, because of the adjacency implied by 1D domains, merging can only take place among adjacent clusters. As a consequence, the successive differences of clustering height also follow Equation (1).
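
Because merging only involves adjacent clusters, single-linkage clustering in one dimension reduces to sorting the gaps between consecutive points. The sketch below illustrates this under that observation; the function name `single_linkage_heights_1d` is hypothetical and no clustering library is used.

```python
def single_linkage_heights_1d(points):
    """Merge heights of single-linkage clustering for one-dimensional points.

    In 1D the closest pair of clusters is always adjacent, so the merge
    heights are the gaps between consecutive sorted points, taken in
    increasing order.
    """
    xs = sorted(points)
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    return sorted(gaps)


heights = single_linkage_heights_1d([6.0, 0.0, 3.0, 1.0])
print(heights)   # points 0 and 1 merge first, then 3 joins, then 6
```

The largest gap is therefore the height of the dendrogram root, which is the quantity later used to normalize branch lengths.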

Figure 2 shows the dendrogram obtained by single-linkage agglomerative clustering of a uniformly scattered set of points with $\lambda = 500$ distributed in the interval $0 < x < 1$.


Figure 2: Dendrogram obtained by single-linkage agglomerative clustering of a set of points with $\lambda = 500$ uniformly distributed along the one-dimensional interval $0 < x < 1$. Observe the absence of any relevant cluster indications.

As expected, no significant cluster is suggested by this dendrogram, given the relatively short extension of the branches leading to relatively large groups of points. Also, observe that the total agglomeration is reached at about 5 times the average expected nearest neighbor distance of $1/\lambda = 1/500 = 0.002$.

Figure 3 depicts the dendrogram obtained for a normally distributed set of points with mean equal to zero and standard deviation equal to 0.1. This type of distribution of points is often observed in practice, being characterized by a gradual increase of point density near the mean. The obtained dendrogram exhibits the chaining effect and does not provide any indication of relevant clusters.

Figure 3: Dendrogram obtained by single-linkage agglomerative clustering of a set of points with $\lambda = 500$ normally distributed with zero mean and standard deviation equal to 0.1. The dendrogram, which exhibits the chaining effect, provides no indication of any relevant cluster.

Let’s now consider a prototype of one-dimensional cluster consisting of two regions of points uniformly distributed with larger intensities $\lambda_1$ and $\lambda_3$ along the respective intervals $0 < x < x_1$ and $x_2 < x < 1$, while a sparser distribution with $\lambda_2 < \lambda_1$ is implemented in the interval $x_1 < x < x_2$. Figure 4 illustrates this configuration.



Figure 4: Scattering of points (above), containing two clusters, obtained by sampling the three intervals in the adopted model with $\lambda_1 = 140$, $\lambda_2 = 20$ and $\lambda_3 = 120$, which are associated with the uniform probability density shown below.

The two groups of points arise as a consequence of the less intense distribution of points existing in the intermediate interval with length $\Delta x = x_2 - x_1$, which we will henceforth call an interstice, between the two groups corresponding to the other two denser intervals. Figure 4 illustrates one example of this clustering model with respect to $x_1 = 0.4$, $x_2 = 0.6$, $\lambda_1 = 140$, $\lambda_2 = 20$ and $\lambda_3 = 120$. The respective dendrogram, obtained by using single-linkage, is shown in Figure 5.

Unlike the dendrograms obtained previously, now we have an evident indication of the existence of two clusters, characterized by long branches leading to groups with a substantial number of points, corresponding to the two expected clusters. Observe that the branch length of the two main clusters is much larger than the average nearest neighbor distance of $1/280 \approx 0.003571$ that would have been observed for a completely uniform scattering of 280 points in the interval $0 < x < 1$.

Though we develop our two-modal clustering model based on uniform distributions of points, similar properties could be expected when each cluster follows other types of statistical distributions, such as normal. Also, observe that this model considers the situation in which the interstice is not completely void.

Figure 5: The dendrogram obtained by single-linkage clustering of the considered one-dimensional cluster structure shown in Figure 4. The two main clusters are effectively identified despite the relatively small separation $\Delta x = x_2 - x_1 = 0.2$. Observe the streaks at the right-hand side, which correspond mostly to the points in the interstice. The relevance of the existence of clustering in this dendrogram, obtained by using the proposed algorithm, was $p \approx 0.7453$, with the length of the second longest branch, streaks ignored, corresponding to $\approx 0.034$.

The points belonging to the interstice tend to appear as long, narrow branches leading to a relatively small number of points. These branches, which we will henceforth call streaks, need to be ignored while detecting the more relevant clusters. There are several possible means to do so, but here we resort to the following simple method:

Starting from the root ($k = 2$ clusters), cut the dendrogram at successive numbers of groups $k$. For each of these cases, consider only those branches leading to at least $n = N/a$ points, where $N$ is the total number of points and $a > 1$ is a parameter. This allows most of the streaks to be pruned. Stop when the number of remaining branches is smaller than 2. The length $\ell$ of the cluster at the interruption time is divided by the largest height value $H$ and taken as an indication of the relevance $p = \ell / H$ of the existence of clustering in the original set of points, with $0 \leq p \leq 1$.
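
The procedure above leaves some details open, so the following is only one possible reading, written as a minimal sketch rather than as the implementation used for the reported results. The assumptions are: the cutting proceeds until two branches with at least $N/a$ points appear (or no such branch remains), the length of a branch is the height at which it merges minus the height at which it was formed, and the shorter of the two main branches is the one compared with the largest height. The name `clustering_relevance` is hypothetical, and the dendrogram is obtained directly from the sorted gaps, which is valid only in one dimension.

```python
def clustering_relevance(points, a=4):
    """Relevance of a two-cluster structure in a one-dimensional signal.

    ASSUMPTIONS (not stated in the text): the cutting stops at the first k
    showing two branches with at least N/a points, and the length of a branch
    is its merging height minus its formation height.
    """
    xs = sorted(points)
    total = len(xs)
    if total < 2:
        return 0.0
    gaps = [xs[j + 1] - xs[j] for j in range(total - 1)]
    largest_height = max(gaps)            # height of the dendrogram root
    if largest_height == 0:
        return 0.0
    min_size = total / a
    # In 1D, cutting into k groups amounts to removing the k - 1 largest gaps.
    order = sorted(range(len(gaps)), key=lambda j: gaps[j], reverse=True)
    cuts = []
    for k in range(2, total + 1):
        cuts.append(order[k - 2])
        bounds = sorted(cuts)
        starts = [0] + [c + 1 for c in bounds]
        ends = bounds + [total - 1]
        # Keep only branches with enough points; the others are streaks.
        main = [(s, e) for s, e in zip(starts, ends) if e - s + 1 >= min_size]
        if not main:
            return 0.0                    # further cuts cannot create large groups
        if len(main) >= 2:
            lengths = []
            for s, e in main:
                formed = max(gaps[s:e], default=0.0)   # largest internal gap
                borders = [gaps[j] for j in (s - 1, e) if 0 <= j < len(gaps)]
                lengths.append(min(borders) - formed)
            return min(lengths) / largest_height
    return 0.0


two_groups = [0.0, 0.01, 0.02, 0.03, 1.0, 1.01, 1.02, 1.03]
evenly_spaced = [j / 100 for j in range(100)]
print(clustering_relevance(two_groups), clustering_relevance(evenly_spaced))
```

Two well-separated groups yield a relevance close to 1, while evenly spaced points yield a value close to 0, in line with the qualitative behavior described for the dendrograms.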

Observe that the above algorithm tends to work even 
if more than two main well-defined clusters are present in 
the original data, provided they have size larger than n. 

By using this algorithm with $a = 4$, we obtain a relevance $p \approx 0.7453$ for the dendrogram in Figure 5.

The potential of this simple approach has been supported in all the examples above, and also by many other configurations not included in this article. Also, observe that this method has potential for properly identifying all the types of clusters/separations in Figure 1. This can be in part understood as a consequence of the ability of the adopted one-dimensional clustering approach to emphasize the existence of two adjacent concentrated densities of points, being relatively less affected by the type of separation. Another reason for the potential effectiveness of this method is its ability to focus on local separations, therefore avoiding interference from points in other regions or orientations, something that can hardly be achieved by methods based on projections.

The above discussion and results suggest that this method can be adopted for two important tasks, namely feature selection and cluster detection, which are respectively addressed in the following sections.

4 Feature Selection 

Feature selection is important both for supervised and unsupervised pattern recognition. Even in deep learning, where feature selection is often understood to be part of the learning dynamics and not expected to be pre-defined, the identification of particularly effective features can be of interest while trying to understand how the classification was achieved.

Basically, given $M$ features of any type, feature selection aims at identifying those that contribute more decisively to the proper classification of the existing groups. As a consequence of the importance of feature selection, a relatively large number of approaches has been reported in the literature.

Two main types of feature selection approaches are often identified: (i) those which consider the result of the classification itself as an indication of the relevance of specific sets of features, which is often called wrapper; and (ii) those, called filter, in which this relevance is inferred by indirect methods, such as in terms of scattering distances between the existing clusters or correlations between the features.

Here, we develop a simple filter approach in which each of the $M$ features defines respective one-dimensional slicings of the feature space. For generality’s sake, and in order to consider more information and more points in each slice than could otherwise be obtained infinitesimally, we consider that each of these straight slices corresponds to a hypercylinder with radius $r$ in the respective $M$-dimensional feature space.

More specifically, the simple feature selection approach suggested here consists of: for each feature $f_i$, $i = 1, 2, \ldots, M$, fix all other feature values as $f_j = \tilde{f}_j = \text{constant}$, $j \neq i$, which defines the line (the hypercylinder axis) $L_i : (f_1 = \tilde{f}_1, f_2 = \tilde{f}_2, \ldots, f_M = \tilde{f}_M)$, with $f_{i,min} < f_i < f_{i,max}$. Identify all marked points in the feature space that are at a maximum distance $r$ from $L_i$, given as

$$r = \sqrt{\sum_{j=1, j \neq i}^{M} \left(f_j - \tilde{f}_j\right)^2} \qquad (2)$$

The $f_i$-coordinates of each of these points define the one-dimensional signal $x$ to be analyzed.
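
The slicing step can be sketched as follows. This is an illustrative sketch under the stated definitions, with hypothetical names (`slice_along_feature`, `anchor`): a point belongs to the slice along feature $i$ when its distance to the axis, computed over all the other features, does not exceed the radius.

```python
import math


def slice_along_feature(points, i, anchor, r):
    """Return the sorted i-th coordinates of the points inside a hypercylinder.

    points : list of M-dimensional tuples (the feature vectors)
    i      : index of the feature along which the slice is taken
    anchor : M-dimensional tuple with the fixed values of the other features
             (its i-th entry is ignored)
    r      : radius of the hypercylinder
    """
    signal = []
    for p in points:
        # Distance to the axis: only the features other than i contribute.
        dist = math.sqrt(sum((p[j] - anchor[j]) ** 2
                             for j in range(len(p)) if j != i))
        if dist <= r:
            signal.append(p[i])
    return sorted(signal)


pts = [(0.1, 0.0), (0.5, 0.05), (0.9, 0.5), (0.3, -0.08)]
x = slice_along_feature(pts, i=0, anchor=(0.0, 0.0), r=0.1)
print(x)   # the point (0.9, 0.5) lies outside the slice
```

The resulting list plays the role of the one-dimensional signal to which the single-linkage analysis is applied.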

Relatively high relevance values $p$, obtained at least for some instances of the one-dimensional sliced signals $x$, can potentially be taken as an indication of feature $f_i$ being relevant for consideration in respective clustering approaches.

As a consequence of its ‘surgical’ operation in the feature space, this method is potentially capable of estimating the contribution that each feature can provide even with respect to less frequently considered types of interstices/borders between clusters. In addition, given that it does not involve global projections, the potential of each feature can be better assessed regarding its possible contribution to separating groups.

In order to illustrate the potential of this simple approach, we consider the situation in which we have squares with side $\gamma$ and circles with radius $\gamma$, with $\gamma$ uniformly distributed in the interval $1 < \gamma < 2$.

Let’s consider the following 5 features for the characterization of these shapes: (i) $\gamma$, uniformly distributed for both circles and squares; (ii) perimeter ($p = 2\pi s$ for circles and $p = 4s$ for squares, where $s = \gamma$ denotes the radius or side); (iii) area ($a = \pi s^2$ for circles and $a = s^2$ for squares); (iv) relative perimeter $r = p/s$ ($2\pi$ for circles and 4 for squares); and (v) circularity $c = 4\pi a / p^2$ (1 for circles and $\pi/4$ for squares). The two latter measurements are expected to contribute more effectively to the separation between circles and squares, as they do not depend on the respective scales defined by the varying values of $s$ and present different values for the two types of shapes.
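
The scale independence of the two latter measurements can be verified with a short sketch. It assumes the common definition of circularity as $4\pi \cdot \text{area} / \text{perimeter}^2$, which is an assumption adopted here because it yields 1 for circles; the function name `shape_features` is hypothetical.

```python
import math


def shape_features(kind, s):
    """Return (size, perimeter, area, relative perimeter, circularity)."""
    if kind == "circle":
        p, a = 2 * math.pi * s, math.pi * s ** 2
    elif kind == "square":
        p, a = 4 * s, s ** 2
    else:
        raise ValueError(f"unknown shape: {kind}")
    r = p / s                          # relative perimeter
    c = 4 * math.pi * a / p ** 2       # ASSUMED definition of circularity
    return s, p, a, r, c


for size in (1.0, 1.5, 2.0):
    print(size, shape_features("circle", size)[3:], shape_features("square", size)[3:])
```

For every size, the last two features take one constant value for circles and a different constant value for squares, while the perimeter and area change with the size for both shapes.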

Each of these obtained features was respectively standardized, i.e.

$$\text{new feature} = \frac{\text{feature} - (\text{mean of feature})}{(\text{standard deviation of feature})} \qquad (3)$$

which ensures that each normalized feature has mean equal to zero and standard deviation equal to 1. In addition, most of the normalized values fall within the interval $[-2, 2]$, which allows us to define the region of interest in the feature space as $-2 < f_i < 2$ for $i = 1, 2, \ldots, M$.
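
Equation (3) can be sketched in a few lines. The choice of the population standard deviation is an assumption, since the text does not specify whether the population or the sample estimate was used; the name `standardize` is hypothetical.

```python
import statistics


def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)    # ASSUMPTION: population estimate
    if std == 0:
        raise ValueError("a constant feature cannot be standardized")
    return [(v - mean) / std for v in values]


z = standardize([1.0, 2.0, 3.0, 4.0, 5.0])
print(statistics.mean(z), statistics.pstdev(z))
```

After this transformation all features share the same scale, so that a single hypercylinder radius can be used along every direction.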

Uniformly distributed noise in the range $[-0.2, 0.2]$ was added to each feature, after standardization. The slices were taken for all permutations of uniformly spaced feature values of $-2, -1, 0, 1, 2$, with the radius of the hypercylinder set as $r = 2$ and $a = 4$. Only slices $x$ containing more than 150 points were considered by the one-dimensional clustering approach.

Figure 6 shows the histogram of the number of points in each of the sliced $x$ signals, with an average of 940.34. The number of points in most of the analyzed one-dimensional scatterings of points $x$ corresponds to approximately 1/5 of the total of 5000 objects (circles and squares). Thinner slicing would require more objects, in order to allow more points to fall inside the slicing cylinders.

Table 1: The number of occurrences in which clusters have been identified, the average relevance, and the product of these two values, for each of the 5 considered features, obtained for the set of 5000 circles and squares by using the suggested one-dimensional feature selection approach.

feature       occurrences   relevance (p)   occurrences x relevance
$\gamma$      0             0               0
perim.        1             0.7467          0.7467
area          2             0.9137          1.8276
rel. perim.   61            0.3407          20.7849
circ.         67            0.3553          23.8109



Figure 6: Histogram of the number of points in the analyzed one-dimensional slices of the feature space containing circles and squares.

Table 1 presents, for each of the considered features, the number of occurrences of successful cluster identification (i.e. cases in which at least two well-defined clusters were identified), the average relevance, and the respective product of these two values.

The obtained indicators suggest that the $\gamma$, perimeter and area measurements were not effective, as they led to very few identification occurrences, while the relative perimeter $r$ and the circularity $c$ yielded not only many identification occurrences, but also a relatively high (in the order of 1/3) relative height of the shorter main cluster, leading to the highest products of occurrences and relevance, which turned out to be similar for these two features. This is in agreement with what could be expected from the type of considered shapes and the properties of the adopted measurements, indicating the potential of the proposed method for feature selection.


5 Cluster Detection 


In addition to applications to feature selection, the one- 
dimensional slicing method reported in this article can 
also be considered for cluster detection approaches. The 
basic idea is that patches of the interstices can be iden¬ 
tified by the one-dimensional method and incrementally 
merged so as to obtain longer, more complete separation 
regions. 

Many approaches can be tried, but here we consider only one approach focusing on the identification of patches of interstices between potentially existing clusters. The method, which is straightforward, involves performing several one-dimensional slices and, in case clusters are identified, identifying the respective interstices and marking them in the original feature space.

Given an obtained one-dimensional signal $x$ with two identified clusters, respective interstice patches can be estimated by identifying the limits of the two main clusters. More specifically, the interstice can be understood as corresponding to the region between the largest $x$-value of the left-hand side cluster and the smallest $x$-value of the right-hand side cluster, yielding the interval $x_a < x < x_b$.
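
The estimation of the interstice limits can be sketched as below, assuming that the two main clusters have already been identified and are given as lists of coordinates; the function name `interstice` is hypothetical.

```python
def interstice(left_cluster, right_cluster):
    """Return the limits (x_a, x_b) of the interstice between two 1D clusters.

    x_a is the largest value of the left-hand side cluster and x_b is the
    smallest value of the right-hand side cluster.
    """
    x_a = max(left_cluster)
    x_b = min(right_cluster)
    if x_a >= x_b:
        raise ValueError("the clusters overlap or are given in the wrong order")
    return x_a, x_b


limits = interstice([0.10, 0.20, 0.35], [0.62, 0.70, 0.90])
print(limits)
```

Points of the slice falling strictly between the two limits, such as streak points, can then be marked as belonging to the interstice patch in the original feature space.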

In order to illustrate this simple method for interstice identification, we consider the two-dimensional feature space shown in Figure 7(a), containing 1000 points organized into two elongated clusters and defined by two hypothetical features $f_1$ and $f_2$.

One-dimensional clustering was performed on several slices taken along the $x$-axis. More specifically, we considered $y = 0, 0.01, 0.02, \ldots, 1$, $a = 4$ and $r = 0.1$, while only slices containing more than 100 points were taken into account. In each case of relevant cluster indication, the interstice was identified and marked in the feature space. The result, given in Figure 7(b), corresponds to a proper identification of the separation region in the 2D space.


6 Concluding Remarks 

More traditional clustering approaches, mostly oriented toward the identification of granular and separatrix groups, aim at obtaining well-separated (often topologically) groups. These methods tend to have an inherent topologically global nature, in the sense of each cluster being considered with respect to all other adjacent portions and directions of the feature space. Despite their potential for achieving more complete cluster identification, the global requirement can impose severe conceptual and computational demands on respective methods.
Figure 7: Two elongated clusters defined in a two-dimensional feature space (a), and detection of the respective interstice (in light green) by using the suggested one-dimensional clustering approach (b).


The main objective of the present work has been to extend the definition of clusters in order to include situations where each group can be partially attached to other clusters, giving rise to the paradoxical concept of linked cluster. Such a generalized understanding of clusters immediately implies the need for methods to complement more traditional approaches aimed at obtaining completely detached groups. It has been argued here that the proper patched identification of clusters and respective interstices can be, in principle, performed with simplicity, low computational cost, and relative efficiency by using the reported one-dimensional single-linkage agglomerative method.

A simple procedure has been outlined that, given a one-dimensional slice signal $x$, looks for the existence of clusters, returning as result a relevance index $0 \leq p \leq 1$ corresponding to the length of the second longest dendrogram branch divided by the largest height value. This algorithm was shown to perform properly for uniform and normal distributions of points devoid of clusters, as well as for a clustering model involving two intervals with relatively high density of points separated by a less intense density taking place along an interstice.

This simple one-dimensional method for identifying the relevance of clusters along sliced signals was then applied to derive an algorithm for feature selection. The basic idea was to obtain several one-dimensional probes along each feature taken in turn, placed at varying configurations of the other remaining features, and to consider those features leading to higher relevance of cluster existence as being potentially more effective to be selected as features in diverse clustering algorithms. This approach was successfully illustrated with respect to a hypothetical set of data including circles and squares with uniformly distributed radius/sides. The suggested method was capable of identifying, among the 5 considered features, the two alternatives that were potentially more relevant for discriminating between the considered objects.

The same one-dimensional method for cluster identification in signals sliced from feature spaces was then briefly considered as the basis for developing cluster detection algorithms. A simple method was suggested which involves the identification of the coordinates of the interstices along the sliced signals. This method was successfully applied to a simple two-dimensional clustering problem, illustrating its operation and potential.

Though illustrated with respect to relatively simple situations, the proposed methods have potential for good performance in several types of real-world problems. Additional experiments considering higher dimensional feature spaces, as well as several types of features and numbers of classes and individuals, could provide a better indication about the characteristics of the proposed approaches.

All in all, we hope to have made the following main contributions: (a) to generalize the concept of cluster to incorporate partial clusters; (b) to describe a simple and effective one-dimensional method for local cluster identification in slices obtained from feature spaces; (c) to illustrate the particular suitability of the single-linkage agglomerative approach for this purpose; (d) to derive a simple and potentially effective method for feature selection; and (e) to outline a simple method for cluster detection.

The reported contributions pave the way to several possible future developments. In particular, it would be interesting to develop more sophisticated and robust cluster detection approaches, e.g. adapted to multiple classes, and also considering linear combinations of the original measurements. It would also be interesting to incorporate the described one-dimensional clustering approach into more global pattern recognition methods, possibly underlain by optimization regarding the slicing configurations and spatial scales. In addition, the proposed concepts and results may also cast some light on the workings of deep learning approaches, such as by helping to understand how effective features can be automatically obtained.